feat(capture): v1 wire serialization transforms (capture v1, 2/6)#702

Draft
eli-r-ph wants to merge 2 commits into capture-v1/01-config from capture-v1/02-serialize

@eli-r-ph

💡 Motivation and Context

Second PR in the stacked Capture V1 series (stacked on #701). Adds the pure, no-I/O serialization layer that turns a legacy-shaped queued message into a /i/v1/analytics/events wire event. Still inert — nothing calls these functions yet (transport + wiring come in later PRs).

New module posthog/capture_v1.py:

  • to_v1_event(msg) — dict-to-dict transform proven against the server contract (rust/capture/src/v1/analytics/types.rs) and posthog-go's capture_v1.go:
    • Lifts the sentinel properties into the typed options object, including the $ignore_sent_at -> disable_skew_correction rename.
    • Promotes $session_id/$window_id to top-level string fields.
    • Relocates top-level $set/$set_once into properties — v1 has no top-level form, and the legacy set()/set_once() builders emit them at the top level, so without this person-property updates would silently vanish.
    • Strips $lib/$lib_version (the server injects them from the required PostHog-Sdk-Info header).
    • Coerces options to native JSON types or omits them. v1 deserializes options strictly; a wrong type (e.g. cookieless_mode: "true") 400s the entire batch, so a bad value is dropped rather than forwarded.
    • Pure: the input message is not mutated (safe for retries/callbacks).
  • build_v1_batch_body(events, historical_migration) — the api_key/sent_at-free envelope with a tz-aware RFC3339 created_at (historical_migration omitted when false).
  • Shared constants: endpoint path, required header names, result codes, retryable/terminal status sets (used by the transport PR).
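
As a rough illustration of the to_v1_event(msg) steps listed above, here is a minimal sketch, not the code in posthog/capture_v1.py. Several details are assumptions made for the example: the `$cookieless_mode` property name, the `session_id`/`window_id` wire field names, the passthrough key allowlist, the accepted boolean spellings, and the choice that an existing properties.$set key wins over a relocated top-level one. Timestamp handling is left out.

```python
import copy
from typing import Any, Dict, Optional

# Property name -> option name. The $ignore_sent_at rename comes from the PR
# description; "$cookieless_mode" as a property name is an assumption.
OPTION_SENTINELS = {
    "$ignore_sent_at": "disable_skew_correction",
    "$cookieless_mode": "cookieless_mode",
}
# Assumed wire names for the promoted string sentinels.
TOP_LEVEL_SENTINELS = {"$session_id": "session_id", "$window_id": "window_id"}
STRIPPED = ("$lib", "$lib_version")
# Assumed allowlist; any other top-level key (e.g. "type") is not copied.
PASSTHROUGH = ("event", "distinct_id", "uuid", "timestamp")


def coerce_bool(value: Any) -> Optional[bool]:
    """Return a native bool, or None when the value cannot be coerced."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.strip().lower() in ("true", "false"):
        return value.strip().lower() == "true"
    return None


def to_v1_event_sketch(msg: Dict[str, Any]) -> Dict[str, Any]:
    # Deep-copy so nested dicts in the caller's message are never modified.
    props = copy.deepcopy(msg.get("properties") or {})
    out: Dict[str, Any] = {k: msg[k] for k in PASSTHROUGH if k in msg}

    options: Dict[str, Any] = {}
    for sentinel, option in OPTION_SENTINELS.items():
        if sentinel in props:
            coerced = coerce_bool(props.pop(sentinel))  # always leaves properties
            if coerced is not None:  # a bad value is omitted, not forwarded
                options[option] = coerced

    for sentinel, field in TOP_LEVEL_SENTINELS.items():
        value = props.pop(sentinel, None)
        if isinstance(value, str):
            out[field] = value

    for key in ("$set", "$set_once"):
        if isinstance(msg.get(key), dict):
            existing = props.get(key)
            # Assumption: keys already present in properties[key] take precedence.
            props[key] = {
                **copy.deepcopy(msg[key]),
                **(existing if isinstance(existing, dict) else {}),
            }

    for key in STRIPPED:
        props.pop(key, None)

    out["properties"] = props
    out["options"] = options
    return out


legacy = {
    "type": "capture",
    "event": "signup",
    "distinct_id": "user-1",
    "$set": {"plan": "pro"},
    "properties": {
        "$ignore_sent_at": "true",
        "$cookieless_mode": "maybe",
        "$session_id": "s-1",
        "$lib": "posthog-python",
        "color": "red",
    },
}
wire = to_v1_event_sketch(legacy)
```

In this sketch `wire` carries `options == {"disable_skew_correction": True}`: the uncoercible `"maybe"` is dropped from both options and properties, and `legacy` is left exactly as it was passed in.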

💚 How did you test it?

New posthog/test/test_capture_v1.py (47 cases, parameterized): bool/string coercion across all variants; per-sentinel lift + rename + coercion; a sentinel that fails coercion is omitted from options but still removed from properties; top-level string sentinels; options == {} when empty; $lib stripping; non-wire keys (e.g. type) not leaked; input-not-mutated; top-level $set/$set_once relocation incl. merge-precedence with an existing properties.$set; $groups left untouched; tz-aware/passthrough/default timestamps; envelope shape + RFC3339 created_at + historical_migration toggling.
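
The input-not-mutated case is worth spelling out, because a shallow copy is not enough when the message has a nested properties dict. Below is a small sketch of how such a check could look. The helper and both transforms are hypothetical names invented for this example and are not taken from test_capture_v1.py.

```python
import copy
from typing import Any, Callable, Dict

Message = Dict[str, Any]


def assert_input_not_mutated(
    transform: Callable[[Message], Message], message: Message
) -> Message:
    """Run transform and fail if it changed message, including nested dicts."""
    snapshot = copy.deepcopy(message)
    result = transform(message)
    assert message == snapshot, "transform mutated its input"
    return result


def strip_lib_shallow(msg: Message) -> Message:
    # Buggy on purpose: dict(msg) copies only the outer dict, so the nested
    # "properties" dict is still shared with the caller's message.
    out = dict(msg)
    out["properties"].pop("$lib", None)
    return out


def strip_lib_deep(msg: Message) -> Message:
    # Safe variant: the nested dict is copied before it is modified.
    out = copy.deepcopy(msg)
    out["properties"].pop("$lib", None)
    return out
```

Passing `strip_lib_deep` through the helper succeeds, while `strip_lib_shallow` trips the assertion because it removes `$lib` from the caller's own properties dict.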

ruff format/check clean; mypy clean on the new module; regenerated references/public_api_snapshot.txt.

📝 Checklist

  • I reviewed the submitted code.
  • I added tests to verify the changes.
  • I updated the docs if needed.
  • No breaking change (additive; transforms are not yet wired in).

🤖 Agent context

Autonomy: Human-driven (agent-assisted)

Authored with Cursor (Claude Opus 4.8) per the agreed plan. Architecture note: posthog-go builds v1 events from typed message structs, but posthog-python's queue holds already-assembled legacy dicts, so this is a dict-to-dict transform applied at send time — the wire shape and sentinel table are identical to go's. The strict-type/coerce-or-omit and $set relocation behaviors were verified directly against the Rust Event/Options structs and their serde tests.

@greptile-apps

greptile-apps Bot commented Jun 27, 2026

Contributor

Reviews (1): Last reviewed commit: "feat(capture): add v1 wire serialization..."

4 comment threads on posthog/capture_v1.py (outdated)
@github-actions

github-actions Bot commented Jun 27, 2026

Contributor

posthog-python Compliance Report

Date: 2026-06-28 00:55:07 UTC
Duration: 530108ms

✅ All Tests Passed!

45/45 tests passed


Capture Tests

29/29 tests passed

| Test | Status | Duration |
| --- | --- | --- |
| Format Validation.Event Has Required Fields | ✅ | 517ms |
| Format Validation.Event Has Uuid | ✅ | 10007ms |
| Format Validation.Event Has Lib Properties | ✅ | 10007ms |
| Format Validation.Distinct Id Is String | ✅ | 10006ms |
| Format Validation.Token Is Present | ✅ | 10007ms |
| Format Validation.Custom Properties Preserved | ✅ | 10007ms |
| Format Validation.Event Has Timestamp | ✅ | 10006ms |
| Retry Behavior.Retries On 503 | ✅ | 18019ms |
| Retry Behavior.Does Not Retry On 400 | ✅ | 12004ms |
| Retry Behavior.Does Not Retry On 401 | ✅ | 10006ms |
| Retry Behavior.Respects Retry After Header | ✅ | 16013ms |
| Retry Behavior.Implements Backoff | ✅ | 30018ms |
| Retry Behavior.Retries On 500 | ✅ | 13009ms |
| Retry Behavior.Retries On 502 | ✅ | 16010ms |
| Retry Behavior.Retries On 504 | ✅ | 16010ms |
| Retry Behavior.Max Retries Respected | ✅ | 30017ms |
| Deduplication.Generates Unique Uuids | ✅ | 7002ms |
| Deduplication.Preserves Uuid On Retry | ✅ | 16015ms |
| Deduplication.Preserves Uuid And Timestamp On Retry | ✅ | 23018ms |
| Deduplication.Preserves Uuid And Timestamp On Batch Retry | ✅ | 16004ms |
| Deduplication.No Duplicate Events In Batch | ✅ | 10002ms |
| Deduplication.Different Events Have Different Uuids | ✅ | 10007ms |
| Compression.Sends Gzip When Enabled | ✅ | 10007ms |
| Batch Format.Uses Proper Batch Structure | ✅ | 10006ms |
| Batch Format.Flush With No Events Sends Nothing | ✅ | 5005ms |
| Batch Format.Multiple Events Batched Together | ✅ | 10005ms |
| Error Handling.Does Not Retry On 403 | ✅ | 12008ms |
| Error Handling.Does Not Retry On 413 | ✅ | 10007ms |
| Error Handling.Retries On 408 | ✅ | 14013ms |

Feature_Flags Tests

16/16 tests passed

| Test | Status | Duration |
| --- | --- | --- |
| Request Payload.Request With Person Properties Device Id | ✅ | 9500ms |
| Request Payload.Flags Request Uses V2 Query Param | ✅ | 10007ms |
| Request Payload.Flags Request Hits Flags Path Not Decide | ✅ | 10006ms |
| Request Payload.Flags Request Omits Authorization Header | ✅ | 10007ms |
| Request Payload.Token In Flags Body Matches Init | ✅ | 10007ms |
| Request Payload.Groups Round Trip | ✅ | 10006ms |
| Request Payload.Groups Default To Empty Object | ✅ | 10006ms |
| Request Payload.Person Properties Distinct Id Auto Populated When Caller Omits It | ✅ | 10007ms |
| Request Payload.Disable Geoip False Propagates As Geoip Disable False | ✅ | 10006ms |
| Request Payload.Disable Geoip Omitted Defaults To False | ✅ | 10006ms |
| Request Payload.Flag Keys To Evaluate Contains Only Requested Key | ✅ | 10007ms |
| Request Lifecycle.No Flags Request On Init Alone | ✅ | 5003ms |
| Request Lifecycle.No Flags Request On Normal Capture | ✅ | 10507ms |
| Request Lifecycle.Two Flag Calls Produce Two Remote Requests | ✅ | 9510ms |
| Request Lifecycle.Mock Response Value Is Returned To Caller | ✅ | 10002ms |
| Side Effect Events.Get Feature Flag Captures Feature Flag Called Event | ✅ | 10510ms |

Add posthog/capture_v1.py with the pure (no-I/O) transform layer for
/i/v1/analytics/events:

- to_v1_event(): lifts sentinel properties into the typed options object
  (with the $ignore_sent_at -> disable_skew_correction rename), promotes
  $session_id/$window_id to top-level fields, relocates top-level
  $set/$set_once into properties (v1 has no top-level form), and strips
  $lib/$lib_version (server injects them from PostHog-Sdk-Info). Options are
  coerced to native JSON types or omitted, since a wrong type would 400 the
  whole batch. Pure: the input message is not mutated.
- build_v1_batch_body(): the api_key/sent_at-free envelope with a tz-aware
  RFC3339 created_at.
- Shared constants: path, required header names, result codes, retryable/
  terminal status sets.

Stacked on the capture_mode scaffolding; still inert (nothing calls these yet).
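
For the envelope, a minimal sketch of what build_v1_batch_body() might produce is shown below. It is not the PR's implementation: the `batch` key name for the event list, the `now` parameter (added here only to make the output reproducible), and the treatment of naive datetimes as UTC are all assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def build_v1_batch_body_sketch(
    events: List[Dict[str, Any]],
    historical_migration: bool = False,
    now: Optional[datetime] = None,  # hypothetical hook for deterministic output
) -> Dict[str, Any]:
    created = now or datetime.now(timezone.utc)
    if created.tzinfo is None:
        # Assumption: a naive datetime is interpreted as UTC.
        created = created.replace(tzinfo=timezone.utc)

    body: Dict[str, Any] = {
        # isoformat() on a tz-aware datetime includes the UTC offset,
        # which gives an RFC 3339 timestamp.
        "created_at": created.isoformat(),
        "batch": list(events),  # "batch" is an assumed key name
    }
    # No api_key or sent_at here; the flag is only emitted when true.
    if historical_migration:
        body["historical_migration"] = True
    return body
```

Leaving out `historical_migration` when it is false keeps the default payload minimal and matches the "omitted when false" behavior described in the PR.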
@eli-r-ph eli-r-ph force-pushed the capture-v1/02-serialize branch from 41c6948 to d6d4aa2 Compare June 28, 2026 00:20
@eli-r-ph eli-r-ph force-pushed the capture-v1/01-config branch from b508cf9 to b139d02 Compare June 28, 2026 00:20
@eli-r-ph eli-r-ph self-assigned this Jun 28, 2026
Hold the coercer function directly in _OPTION_SENTINELS instead of a
stringly-typed name keyed through a side _COERCERS dict, removing a
KeyError foot-gun and tightening the types.
@eli-r-ph eli-r-ph force-pushed the capture-v1/02-serialize branch from d6d4aa2 to 4ee420a Compare June 28, 2026 00:45